Enhancing Parallelism by Removing Cyclic Data Dependencies
Authors
Abstract
The parallel execution of loop iterations is often inhibited by recurrence relations on scalar variables. Examples are the use of induction variables and recursive functions. Due to the cyclic dependence between the iterations, these loops have to be executed sequentially. A method is presented to convert a family of coupled linear recurrence relations into explicit functions of a loop index. When the cyclic dependency is the only factor preventing a parallel execution, the conversion effectively removes the dependency and allows the loop to be executed in parallel. The technique is based on constructing and solving a set of coupled linear difference equations at compile time. The method is general for an arbitrary number of coupled scalar variables and can be implemented by a straightforward algorithm. Results show that the parallelism of several sequential EISPACK do-loops is significantly enhanced by converting them into do-all loops.
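As a minimal sketch of the kind of transformation the abstract describes (the loop body, initial values, closed forms, and the OpenMP pragma are illustrative assumptions, not taken from the paper), the C fragment below shows coupled linear recurrences on the scalars j and s rewritten as explicit functions of the loop index, after which the loop can run as a do-all loop:

    #include <stdio.h>

    #define N 16

    /* Sequential version: the coupled recurrences on j and s create a
     * cyclic dependence between iterations, forcing sequential execution. */
    void recurrence_loop(long a[N], long j0, long s0) {
        long j = j0, s = s0;
        for (int i = 0; i < N; i++) {
            j = j + 2;   /* induction variable: j_i = j0 + 2*(i+1)                  */
            s = s + j;   /* coupled recurrence: s_i = s0 + (i+1)*j0 + (i+1)*(i+2)   */
            a[i] = s;
        }
    }

    /* Converted version: each scalar is an explicit function of the loop
     * index i, so the iterations are independent and the loop can execute
     * as a do-all loop (illustrated here with an OpenMP pragma). */
    void doall_loop(long a[N], long j0, long s0) {
        #pragma omp parallel for
        for (int i = 0; i < N; i++) {
            long ii = i + 1;
            long s = s0 + ii * j0 + ii * (ii + 1);  /* closed form of the difference equations */
            a[i] = s;
        }
    }

    int main(void) {
        long a[N], b[N];
        recurrence_loop(a, 1, 0);
        doall_loop(b, 1, 0);
        for (int i = 0; i < N; i++)
            printf("%ld %ld%s\n", a[i], b[i], a[i] == b[i] ? "" : "  MISMATCH");
        return 0;
    }

The paper's method derives such closed forms automatically at compile time for an arbitrary number of coupled scalars; the example above only shows the effect of the conversion on a single hand-solved pair.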
Similar resources
Design and Implementation of an Audio Codec (AMR-WB) using Data Flow Programming Language CAL in the OpenDF Environment
Over the last three decades, computer architects have been able to achieve an increase in performance for single processors by, e.g., increasing clock speed, introducing cache memories and using instruction level parallelism. However, because of power consumption and heat dissipation constraints, this trend is going to cease. In recent times, hardware engineers have instead moved to new chip ar...
SIRA: Schedule Independent Register Allocation for Software Pipelining
Register allocation in loops is generally carried out after or during the software pipelining process, because performing register allocation first, without assuming a schedule, lacks information about the interferences between the live ranges of values. The register allocator then introduces extra false dependencies which dramatically reduce the original ILP (Instruction Level Parallelism)....
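For context (a minimal assumed sketch, not the SIRA technique itself), the C fragment below illustrates the false dependencies this abstract refers to: reusing one temporary for two unrelated values serializes them, while renaming keeps the two computation chains independent at the cost of one extra register.

    /* Reusing one temporary creates a false (write-after-read) dependency. */
    void reuse_temporary(double *x, double *y, double a, double b, double c, double d) {
        double t;
        t = a * b;      /* define t                                            */
        *x = t + 1.0;   /* use t                                               */
        t = c * d;      /* redefine t: must wait for the use above to complete */
        *y = t + 1.0;
    }

    /* Renaming removes the false dependency: the two chains can be
     * scheduled in parallel, at the cost of one more register. */
    void rename_temporaries(double *x, double *y, double a, double b, double c, double d) {
        double t1 = a * b;
        double t2 = c * d;
        *x = t1 + 1.0;
        *y = t2 + 1.0;
    }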
PaRSEC: A programming paradigm exploiting heterogeneity for enhancing scalability
New HPC system designs with steeply escalating processor and core counts, burgeoning heterogeneity and accelerators, and increasingly unpredictable memory access times, call for one or more dramatically new programming paradigms. These new approaches must react and adapt quickly to unexpected contentions and delays, and they must provide the execution environment with sufficient intelligence an...
Impact of Software Bypassing on Instruction Level Parallelism and Register File Traffic
Software bypassing is a technique that allows programmer-controlled direct transfer of results of computations to the operands of data-dependent operations, possibly removing the need to store some values in general purpose registers, while reducing the number of reads from the register file. Software bypassing also improves instruction level parallelism by reducing the number of false dependenc...
Parallélisme des nids de boucles pour l'optimisation du temps d'exécution et de la taille du code. (Nested loop parallelism to optimize execution time and code size)
Real-time implementation algorithms always include nested loops which require significant execution times. Thus, several nested-loop parallelism techniques have been proposed with the aim of decreasing their execution times. These techniques can be classified in terms of granularity: iteration-level parallelism and instruction-level parallelism. In the case of the instructio...
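As an assumed illustration of the coarser of the two granularities mentioned above, iteration-level parallelism distributes independent iterations of a nested loop across threads (shown here with OpenMP; the function and its parameters are hypothetical):

    /* Independent iterations of a nested loop distributed across threads. */
    void add_matrices(int m, int n, double c[m][n], double a[m][n], double b[m][n]) {
        #pragma omp parallel for collapse(2)
        for (int i = 0; i < m; i++)
            for (int j = 0; j < n; j++)
                c[i][j] = a[i][j] + b[i][j];
    }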